Job Tracking on a Grid – the Logging and Bookkeeping and Job Provenance Services
Abstract
Keeping track of a job within a complex Grid environment is a difficult task that cannot easily be delegated to inspecting data from Grid infrastructure monitoring. A job-centric monitoring service provides data about the actual job status, independently of jobs crossing internal administrative domains. It is also a valuable source of data for system administrators, helping them to improve the behavior of the infrastructure. The Logging and Bookkeeping (L&B) service developed within the EGEE project provides a distributed, scalable, job-centric monitoring service able to deal with hundreds of thousands of jobs on large Grids. To provide the necessary scalability without slowing down the processing of jobs within the middleware, the service is based on a non-blocking asynchronous model. This means that the order of events sent to L&B by individual parts of the middleware (user interface, scheduler, computing element, etc.) is not guaranteed. Robust on-the-fly processing is used to derive a meaningful job state from events arriving in random order. L&B may thus temporarily provide information that looks inconsistent with what the user knows from some other source (for example, an independent notification about the job state). The report provides details of the L&B internal design and also discusses how to interpret the L&B results (the job state) correctly. While L&B deals with active jobs only, Job Provenance (JP) is designed to permanently store information about all jobs that run on a Grid. All the relevant information needed to re-submit the job in the same environment, including the computing environment specification and basic input data, is stored and made available for later perusal. Users can annotate stored records, providing yet another metadata layer useful, for example, for job grouping and data mining over the JP. As the data are never
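The idea of deriving a job state from events that arrive in no guaranteed order can be illustrated with a minimal sketch. It is not the L&B implementation: the state names, their ranking, and the `JobStateTracker` class are all hypothetical, and the rule used here (report the most advanced state seen so far) is only one simple way to make the result independent of arrival order.

```python
from dataclasses import dataclass, field

# Illustrative state ordering (an assumption, not L&B's actual state machine).
STATE_RANK = {
    "SUBMITTED": 0,
    "WAITING": 1,
    "SCHEDULED": 2,
    "RUNNING": 3,
    "DONE": 4,
}


@dataclass
class JobStateTracker:
    """Derives a job state from events that may arrive in any order."""

    seen: set = field(default_factory=set)

    def record(self, state: str) -> str:
        """Store one event and return the state derived so far."""
        if state not in STATE_RANK:
            raise ValueError(f"unknown state: {state}")
        self.seen.add(state)
        return self.current_state()

    def current_state(self) -> str:
        """Report the most advanced state among the events received."""
        if not self.seen:
            return "UNKNOWN"
        return max(self.seen, key=STATE_RANK.get)

    def missing_events(self) -> list:
        """List earlier states whose events have not arrived yet."""
        if not self.seen:
            return []
        top = STATE_RANK[self.current_state()]
        pending = [s for s in STATE_RANK
                   if STATE_RANK[s] < top and s not in self.seen]
        return sorted(pending, key=STATE_RANK.get)


if __name__ == "__main__":
    tracker = JobStateTracker()
    # Events arrive out of order: the "RUNNING" event overtakes earlier ones.
    for event in ["RUNNING", "SUBMITTED", "SCHEDULED"]:
        print(event, "->", tracker.record(event))
    print("still missing:", tracker.missing_events())
```

In this sketch a late "DONE" event would leave the tracker reporting "RUNNING" for a while, which mirrors the temporary inconsistency the abstract describes when the user has already learned the final state from another source.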
Related articles
A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability
A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids focus on only one kind of Grid job, which can be data...
Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner
Grid computing refers to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile computing is a generic term for the use of movable, handheld devices with wireless communication to process data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...
Production Services for Information and Monitoring in the Grid
R-GMA is a realization of the Grid Monitoring Architecture (GMA) that also exploits the power of the relational data model and the SQL query language. The biggest challenge during the development of R-GMA was to ensure that it could be scaled to operate in a large grid reliably. The system is being used in areas as diverse as resource discovery, job logging and bookkeeping, network monitoring a...
A General Purpose Suite for Job Management, Bookkeeping, and Grid Submission
This paper briefly presents the prototype of a software framework that lets different multi-disciplinary user communities take advantage of the power of Grid computing. The idea behind the project is to offer a software infrastructure that gives research groups or organizations that need to simulate large amounts of data easy, quick and customizable access to the Grid.